Deep Information Propagation

Authors

  • Samuel S. Schoenholz
  • Justin Gilmer
  • Surya Ganguli
  • Jascha Sohl-Dickstein
Abstract

Using mean field theory, we study the behavior of untrained neural networks whose weights and biases are randomly distributed. We show the existence of depth scales that naturally limit the maximum depth of signal propagation through these random networks. Our main practical result is to show that random networks may be trained precisely when information can travel through them. Thus, the depth scales that we identify provide bounds on how deep a network may be trained for a specific choice of hyperparameters. As a corollary, we argue that in networks at the edge of chaos, one of these depth scales diverges. Thus arbitrarily deep networks may be trained only sufficiently close to criticality. We show that the presence of dropout destroys the order-to-chaos critical point and therefore strongly limits the maximum trainable depth for random networks. Finally, we develop a mean field theory for backpropagation and show that the ordered and chaotic phases correspond to regions of vanishing and exploding gradients, respectively.
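The order-to-chaos distinction described in the abstract can be illustrated numerically. The sketch below is not taken from this page: it assumes a fully connected tanh network with weight variance `sigma_w2` and bias variance `sigma_b2`, iterates the usual mean field variance recursion to its fixed point, and evaluates a quantity `chi1` whose value relative to one separates the two phases. The specific formulas, the choice of tanh, the ordered-phase depth scale `-1 / log(chi1)`, and every function name are assumptions of this illustration.

```python
import numpy as np

# Gauss-Hermite nodes and weights, normalised so that the weights
# integrate against the standard normal density.
_Z, _W = np.polynomial.hermite_e.hermegauss(61)
_W = _W / np.sqrt(2.0 * np.pi)


def gauss_mean(f):
    """Approximate E[f(z)] for z ~ N(0, 1) by quadrature."""
    return float(np.sum(_W * f(_Z)))


def variance_map(q, sigma_w2, sigma_b2):
    """One layer of the assumed mean field recursion for tanh units."""
    return sigma_w2 * gauss_mean(lambda z: np.tanh(np.sqrt(q) * z) ** 2) + sigma_b2


def fixed_point(sigma_w2, sigma_b2, q0=1.0, iters=500):
    """Iterate the variance map until it has (approximately) settled."""
    q = q0
    for _ in range(iters):
        q = variance_map(q, sigma_w2, sigma_b2)
    return q


def chi1(sigma_w2, sigma_b2):
    """Assumed stability indicator: below 1 is ordered, above 1 is chaotic."""
    q = fixed_point(sigma_w2, sigma_b2)
    # Derivative of tanh is 1 - tanh^2; chi1 averages its square.
    return sigma_w2 * gauss_mean(lambda z: (1.0 - np.tanh(np.sqrt(q) * z) ** 2) ** 2)


def ordered_depth_scale(sigma_w2, sigma_b2):
    """Assumed depth scale -1 / log(chi1); only meaningful when chi1 < 1."""
    chi = chi1(sigma_w2, sigma_b2)
    if chi >= 1.0:
        raise ValueError("this formula is only used here in the ordered phase")
    return -1.0 / np.log(chi)


if __name__ == "__main__":
    for sw2 in (0.5, 1.0, 2.0, 4.0):
        print(sw2, chi1(sw2, 0.05))
```

Under these assumptions, hyperparameters with `chi1` below one sit in the ordered phase (vanishing gradients in the abstract's terms), those above one sit in the chaotic phase (exploding gradients), and the depth scale grows without bound as `chi1` approaches one.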

Similar resources

Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework

Deep learning models have achieved great success in many real-world applications. However, most existing networks are designed heuristically and thus lack rigorous mathematical principles and derivations. Several recent studies build deep structures by unrolling a particular optimization model that involves task information. Unfortunately, due to the dynamic nature of network par...

How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation

We propose to exploit reconstruction as a layer-local training signal for deep learning. Reconstructions can be propagated in a form of target propagation playing a role similar to back-propagation but helping to reduce the reliance on derivatives in order to perform credit assignment across many levels of possibly strong nonlinearities (which is difficult for back-propagation). A regularized a...

Simulation of static sinusoidal wave in deep water environment with complex boundary conditions using proposed SPH method

The study of waves and their propagation on the water surface is significant in the design of quays and other marine and water structures. Therefore, in order to design structures that are exposed to direct wave forces, it is necessary to study and simulate the water surface height and the wave forces on the structure's body under different boundary conditions. In this study, the propagation of static ...

Quantitative Comparison of Analytical solution and Finite Element Method for investigation of Near-Infrared Light Propagation in Brain Tissue Model

Introduction: Functional Near-Infrared Spectroscopy (fNIRS) is an imaging method in which a light source and a detector are installed on the head; consequently, light re-emitted from human skin contains information about cerebral hemodynamic alterations. The spatial probability distribution profile of photons penetrating tissue at a source spot, scattering into the tissue, and being released at ...

Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph

This paper presents a co-salient object detection method to find common salient regions in a set of images. We utilize deep saliency networks to transfer co-saliency prior knowledge and better capture high-level semantic information, and the resulting initial co-saliency maps are enhanced by seed propagation steps over an integrated graph. The deep saliency networks are trained in a supervised ...

Transition-Based Deep Input Linearization

Traditional methods for deep NLG adopt pipeline approaches comprising stages such as constructing syntactic input, predicting function words, linearizing the syntactic input and generating the surface forms. Though easier to visualize, pipeline approaches suffer from error propagation. In addition, information available across modules cannot be leveraged by all modules. We construct a transitio...

Journal:
  • CoRR

Volume: abs/1611.01232

Publication date: 2016